Lecture - Curious machines

Greg Detre

Wednesday, April 23, 2003

Presentation - Andrea, Schaal (1999)

Motivation

imitation learning: like reinforcement learning, but without the combinatorial explosion, so that we don't have to pre-program every single task

What is imitation?

increased tendency to execute demonstrated behaviour

true imitation: new behaviour, same task strategy + goal

other useful social learning:

emulation: direct attention to favourable goals

priming: bias exploration to useful stimuli

Cogsci/neuroscience

Perrett: STS could extract attention and goals of others, insensitive to self

Gallese/Goldman: MNs used for "mind reading"

Architecture

Two problems:

parsing perception for what to imitate

deciding how to imitate: the sensory motor map problem

Demiris/Hayes: passive vs active imitation

passive: perceive-recognise-act

imitate known behaviours only

assumes percept-motor mapping

e.g. head postures

active imitation: "not imitating because you understand, but understanding because you are imitating"

behaviour/forward model pair

behaviour: maps task/goal → action

forward model: maps action → next state
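The behaviour/forward model pairing above can be sketched in a few lines. This is a toy illustration of the idea as recorded in these notes, not Demiris and Hayes's implementation: the 1-D state, the class name `BehaviourModelPair`, the helper functions and the two example behaviours are all assumptions made up for the example. Each pair proposes the action it would take in the demonstrator's state, predicts the next state, and is scored by how well that prediction matches what the demonstrator actually did; the best-scoring pair is the one that "understands" the demonstration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

State = float   # toy 1-D state (assumption for illustration)
Action = float


@dataclass
class BehaviourModelPair:
    """One behaviour paired with its forward model (hypothetical structure)."""
    name: str
    behaviour: Callable[[State, State], Action]       # (state, goal) -> action
    forward_model: Callable[[State, Action], State]   # (state, action) -> predicted next state
    goal: State

    def prediction_error(self, observed: Sequence[State]) -> float:
        """Sum of squared errors between predicted and observed next states."""
        error = 0.0
        for current, actual_next in zip(observed, observed[1:]):
            action = self.behaviour(current, self.goal)
            predicted_next = self.forward_model(current, action)
            error += (predicted_next - actual_next) ** 2
        return error


def step_towards(state: State, goal: State) -> Action:
    """Behaviour: move at most one unit towards the goal."""
    return max(-1.0, min(1.0, goal - state))


def integrate(state: State, action: Action) -> State:
    """Forward model: the action is simply added to the state."""
    return state + action


def recognise(pairs: List[BehaviourModelPair], observed: Sequence[State]) -> BehaviourModelPair:
    """Pick the pair whose predictions best match the demonstration."""
    return min(pairs, key=lambda pair: pair.prediction_error(observed))


pairs = [
    BehaviourModelPair("reach_right", step_towards, integrate, goal=5.0),
    BehaviourModelPair("reach_left", step_towards, integrate, goal=-5.0),
]
demonstration = [0.0, 1.0, 2.0, 3.0]   # observed demonstrator states
best = recognise(pairs, demonstration)
print(best.name)
```

Here the demonstrator moves steadily to the right, so the `reach_right` pair predicts every step exactly and wins; the same machinery that generates the behaviour is reused to recognise it.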

Schaal architecture

true imitation + response facilitation: use same circuitry

assumption: state + action of teacher are directly observable and identifiable

Perception problem

ways/heuristics to guide attention in the feature space:

markers, tags, 3d motion capture

distinctive colour

fixed attention criteria, e.g. "pay attention to red objects when looking for apples"

affective qualities, e.g. Cog attends to bright colours when bored, skin colours when lonely

Mapping problem

what's the matching criterion? what's the coordinate frame?

Social learning implementation examples

direct policy learning: no need for student to know the goal

learning by demonstration

learning by imitation

perception of teacher and self: same mechanism

given task goal, robot arm learns task-level policy by reinforcement learning

apparently learned pole-balancing in a single trial!

perhaps by running the forward model (in place of experience) thousands of times, and using reinforcement learning on the simulated experience

true imitation

learning both novel goal and task
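The guess above about the single-trial pole-balancing result (learn a forward model from one trial, then run reinforcement learning against the model rather than the world) can be sketched as follows. This is only a toy under loud assumptions: a five-cell line world stands in for the pole, the forward model is a lookup table, and the planner is plain tabular Q-learning on model samples. None of the names (`record_trial`, `fit_forward_model`, `plan_with_model`, `greedy_policy`) come from the paper.

```python
import random

ACTIONS = (-1, +1)
GOAL = 4  # hypothetical toy task: reach cell 4 on a line of cells 0..4


def real_step(state, action):
    """The real environment; only used for the single recorded trial."""
    return max(0, min(GOAL, state + action))


def record_trial(actions, start=0):
    """Run one real trial and return the observed (state, action, next_state) triples."""
    transitions, state = [], start
    for action in actions:
        nxt = real_step(state, action)
        transitions.append((state, action, nxt))
        state = nxt
    return transitions


def fit_forward_model(transitions):
    """Tabular forward model: (state, action) -> next state, from observed data only."""
    return {(s, a): nxt for s, a, nxt in transitions}


def plan_with_model(model, sweeps=2000, alpha=0.5, gamma=0.9, seed=0):
    """Q-learning on transitions replayed from the model instead of the real world."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    known = list(model.items())
    for _ in range(sweeps):
        (s, a), nxt = rng.choice(known)            # simulated experience
        reward = 1.0 if nxt == GOAL else 0.0
        future = 0.0 if nxt == GOAL else max(q[(nxt, b)] for b in ACTIONS)
        q[(s, a)] += alpha * (reward + gamma * future - q[(s, a)])
    return q


def greedy_policy(q):
    """Best action in every non-goal state according to the learned values."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}


trial = record_trial([+1, +1, -1, +1, +1, +1])   # the one real trial
model = fit_forward_model(trial)
q = plan_with_model(model)
policy = greedy_policy(q)
print(policy)
```

The real world is touched six times; the two thousand learning updates all come from the model. State-action pairs never seen in the trial keep a value of zero, so the policy only ever recommends what the model has evidence for.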

Open research questions

representations of vision, self etc. develop along with motor

basic set of motor primitives

how to acquire + map the demonstrator's goal?

can we get a robot to imitate the goal rather than the action?

need ToM???

why not take advantage of the two-way communication of social learning?

eh???

it's situated learning, involves interaction, rather than just demonstration

Discussion

Breazeal: imitation was starting to become a hot area

Schaal: computational neuroscience

liked the notion that MNs apply to the motor primitives

makes sense to use the same representation for generation and recognition, because you can't perform actions that you can't recognise, and vice versa, right???

movement primitives as corresponding to a trajectory through pose space, with nodes as via-points???

you want task goal and task strategy to be continuous, don't you???

need to tie visual and motor representations, e.g. geons

if you're trying to imitate picking up a cup (whether standing or sitting), you want to pay attention to just a portion of the pose-graph

need to break the pose-graph down from whole-body to body-parts

sometimes the static parts are as important

maybe you could have an encoding to say "keep this bit still"

function vs form: what you're trying to do, and how you go about doing it

continuous path between those

it's actually asking too much of the robots to manage true imitation: we can't learn a roundhouse kick from one example

especially when there's an object involved, or when it needs tactile feedback (e.g. when twiddling a pen), or even the proprioceptive feedback from your own body

you need a means of combining movement primitives to form a movement complex

would a MN fire if you were to move to pick up a glass and make the motion but just not quite grip the cup as you make the picking-up motion???

can you incorporate goals into forward models??? you can definitely incorporate a larger input feature vector (that incorporates some of the external world as well, e.g. whether there's actually a glass there to pick up), but do they constitute goals???

why are people into imitation???

Breazeal:

skill transfer

they're not interested yet in social cognition

the goals are pre-specified, which is why there's no interesting research into goal hierarchies

there's a big problem then of how to communicate your goal, right???

Presentation - Jesse, Schaal (2003)

don't use hidden variables (e.g. teacher's state)

Deb: argues that you need a kind of goal lattice, and that language allows you to shortcut through, and be guided higher or lower (depending on whether the why or how is more important)

because it's a lattice, there are always multiple ancestors, and so ambiguity; if it were just a hierarchy, it would be easy to imitate, because there'd only be one path up the hierarchy to explain why

you have to see the imitation ideas in this paper as the first step towards building body knowledge
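The contrast between a hierarchy and a lattice can be made concrete with a small sketch. The goal names and the `explanations` helper are invented for illustration and are not from the discussion or the paper. Each node lists its parent goals; an "explanation" of an observed action is a path from that action up to a top-level goal.

```python
from typing import Dict, List


def explanations(parents: Dict[str, List[str]], node: str) -> List[List[str]]:
    """All upward paths from node to a top-level goal (a node with no parents)."""
    ups = parents.get(node, [])
    if not ups:
        return [[node]]
    return [[node] + path for up in ups for path in explanations(parents, up)]


# Hierarchy: every node has exactly one parent, so there is one "why".
hierarchy = {
    "grasp cup": ["pick up cup"],
    "pick up cup": ["drink"],
}

# Lattice: "pick up cup" serves two different higher goals.
lattice = {
    "grasp cup": ["pick up cup"],
    "pick up cup": ["drink", "tidy table"],
}

for path in explanations(hierarchy, "grasp cup"):
    print(" -> ".join(path))
for path in explanations(lattice, "grasp cup"):
    print(" -> ".join(path))
```

In the hierarchy the observed grasp has a single explanation; in the lattice it has two, and nothing in the movement itself says which one the demonstrator meant. That is the gap language is argued to fill.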

when does it have body knowledge then?

when you can utilise your forward models to generate new behaviours or skills or adapt to new tasks quickly and well, then you're getting there

use statistical information about the animation to acquire knowledge about the mean joint positions and joint limits
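As a rough illustration of the statistics mentioned above, the sketch below takes a sequence of animation frames (one joint-angle reading per joint per frame) and computes the mean position and the observed range for each joint. The frame format, the joint names and the function `joint_statistics` are assumptions for the example; the paper's actual method is not recorded in these notes.

```python
from statistics import mean
from typing import Dict, List, Tuple


def joint_statistics(frames: List[Dict[str, float]]) -> Dict[str, Tuple[float, float, float]]:
    """Per joint: (mean angle, lowest observed angle, highest observed angle)."""
    stats = {}
    for joint in frames[0]:
        angles = [frame[joint] for frame in frames]
        stats[joint] = (mean(angles), min(angles), max(angles))
    return stats


# Hypothetical animation data: joint angles in degrees, one dict per frame.
frames = [
    {"elbow": 10.0, "shoulder": -20.0},
    {"elbow": 50.0, "shoulder": 0.0},
    {"elbow": 90.0, "shoulder": 20.0},
]
stats = joint_statistics(frames)
for joint, (avg, low, high) in stats.items():
    print(f"{joint}: mean={avg}, observed range=[{low}, {high}]")
```

The observed minimum and maximum only bound the true joint limits from the inside: a joint may be able to move further than the animation happened to show.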

IK??? inverse kinematics

discussion of how language is so necessary/valuable in focusing the search space, what you're doing wrong etc.